Inter-Phone and Inter-Word Distances for Confusability Prediction in Speech Recognition
Authors
Abstract
In this work we investigate new inter-phone and inter-word distances and apply them to predict whether two words in the lexicon of an Automatic Speech Recognition (ASR) system are likely to be confused. The inter-word distance is computed from an alignment between the phonetic transcriptions of the two words by summing the distances between the aligned phones. Our contribution is that the inter-phone distance used to compute the inter-word distance is not the same as the one used to compute the phonetic alignment. The former is calculated between the acoustic models of the phones with a new formula that we propose; the latter is based on phonetic knowledge. We also use two kinds of alignment: with and without insertions and deletions. To evaluate performance, we adopt a classical false acceptance/false rejection framework; the measured prediction Equal Error Rate (EER) is below 2%.
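The two-distance idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the proposed acoustic-model formula, so `toy_acoustic_distance` is a hypothetical lookup table, `toy_phonetic_distance` is a hypothetical stand-in for the knowledge-based distance, and the gap costs are arbitrary. Only the variant with insertions and deletions is shown.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Phone = str
AlignedPair = Tuple[Optional[Phone], Optional[Phone]]
Distance = Callable[[Phone, Phone], float]


def align(a: Sequence[Phone], b: Sequence[Phone],
          align_dist: Distance, gap_cost: float = 1.0) -> List[AlignedPair]:
    """Align two phone sequences by dynamic programming.

    Only align_dist (the knowledge-based distance) drives the alignment.
    A None in a pair marks an insertion or a deletion.
    """
    n, m = len(a), len(b)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], back[i][0] = i * gap_cost, "del"
    for j in range(1, m + 1):
        cost[0][j], back[0][j] = j * gap_cost, "ins"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (cost[i - 1][j - 1] + align_dist(a[i - 1], b[j - 1]), "sub"),
                (cost[i - 1][j] + gap_cost, "del"),
                (cost[i][j - 1] + gap_cost, "ins"),
            ]
            cost[i][j], back[i][j] = min(options, key=lambda t: t[0])
    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs: List[AlignedPair] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "sub":
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif move == "del":
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def inter_word_distance(a: Sequence[Phone], b: Sequence[Phone],
                        align_dist: Distance, acoustic_dist: Distance,
                        gap_cost: float = 1.0, gap_dist: float = 1.0) -> float:
    """Sum a second (acoustic) distance over pairs aligned with the first."""
    total = 0.0
    for p, q in align(a, b, align_dist, gap_cost):
        total += gap_dist if p is None or q is None else acoustic_dist(p, q)
    return total


# Hypothetical toy distances, for illustration only.
VOWELS = {"ae", "iy", "uw"}


def toy_phonetic_distance(p: Phone, q: Phone) -> float:
    if p == q:
        return 0.0
    return 0.5 if (p in VOWELS) == (q in VOWELS) else 1.0


TOY_ACOUSTIC_TABLE = {frozenset(("k", "b")): 0.7}  # made-up value


def toy_acoustic_distance(p: Phone, q: Phone) -> float:
    if p == q:
        return 0.0
    return TOY_ACOUSTIC_TABLE.get(frozenset((p, q)), 1.0)
```

With these toy distances, `inter_word_distance(["k", "ae", "t"], ["b", "ae", "t"], toy_phonetic_distance, toy_acoustic_distance)` aligns the phones one to one and sums the acoustic distances of the aligned pairs. A word pair would then be flagged as confusable when this sum falls below a threshold, and sweeping that threshold gives the false acceptance/false rejection trade-off from which an EER is read.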
Similar resources
Word confusability prediction in automatic speech recognition
A new method to predict if two words are likely to be confused by an Automatic Speech Recognition (ASR) system is presented in this paper. A new inter-word dissimilarity measure based on Dynamic Time Warping (DTW) is used to classify the word pairs as confusable or not confusable. Firstly, the phonetic transcriptions of the two words to compare are aligned using only phonetic information. After...
Word confusability - measuring hidden Markov model similarity
We address the problem of word confusability in speech recognition by measuring the similarity between Hidden Markov Models (HMMs) using a number of recently developed techniques. The focus is on defining a word confusability that is accurate, in the sense of predicting artificial speech recognition errors, and computationally efficient when applied to speech recognition applications. It is sho...
Envelope-based inter-aural time difference localization training to improve speech-in-noise perception in the elderly
Background: Many elderly individuals complain of difficulty in understanding speech in noise despite having normal hearing thresholds. According to previous studies, auditory training leads to improvement in speech-in-noise perception, but these studies did not consider the etiology, so their results cannot be generalized. The present study aimed at investigating the effectiveness of envelope-b...
Modelling pronunciation variations in spontaneous Mandarin speech
Pronunciation in spontaneous Mandarin speech tends to be much more variable than in read speech. In current recognition systems, pronunciation dictionaries usually only contain one standard pronunciation for each word, so that the amount of variability that can be modelled is very limited. Most recent research work for modelling variations in spontaneous speech focuses on the lexicon level, whi...
Intra-speaker variation and units in human speech perception and ASR
Research on speech perception and ASR has resulted in several important advances in our understanding of speech variation: one is that speaker dependent variation is systematic, another is that inter-speaker and intra-speaker variation diverge in their root causes and characteristics. Therefore, a successful approach to one may not always transfer to the other. Intertalker variation, or indexical ...
Journal title:
- Procesamiento del Lenguaje Natural
Volume 33, Issue
Pages -
Publication date: 2004